Phrase Based Document Retrieving by Combining Suffix Tree index data structure and Boyer- Moore faster string searching algorithm

نویسنده

  • B. Ganga
چکیده

Phrase has been considered as a more informative feature term for improving the effectiveness of document retrieval .This paper propose an Algorithm A Phrase Based Document Retrieval to retrieve the similar documents by combining two exiting algorithm suffix tree ,index data structure and “The Boyer-Moore Algorithm”, faster string searching algorithm. The suffix tree is constructed based on E. Ukkonen, “on-Line Construction Of Suffix Trees For Strings, a most efficient string-matching algorithm. On the constructed suffix ,”The Boyer-Moore Algorithm” is applied to check the presence of pattern i.e. the input phrase in order and without order to retrieve the similar documents. Furthermore, by studying the property of suffix tree and Boyer-Moore, we conclude that suffix tree data structure store huge documents and Boyer-Moore algorithm checks the presence of pattern fastly. This conclusion sufficiently explains why the Phrase Based Document Retrieval works much better than the other document retrieval.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Skriptum VL Text Indexing

In this section we will introduce suffix trees, which, among many other things, can be used to solve the string matching task (find pattern P of length m in a text T of length n in O(n + m) time). We already know that other methods (Boyer-Moore, e.g.) solve this task in the same time. So why do we need suffix trees? The advantage of suffix trees over the other string-matching algorithms (Boyer-...

متن کامل

Skriptum VL Text-Indexierung

In this section we will introduce suffix trees, which, among many other things, can be used to solve the string matching task (find pattern P of length m in a text T of length n in O(n+m) time). In the exercises, we have already seen that other methods (Boyer-Moore, e.g.) solve this task in the same time. So why do we need suffix trees? The advantage of suffix trees over the other string-matchi...

متن کامل

Space-efficient Data Structures for String Searching and Retrieval

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 The Models of Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2 Our Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ...

متن کامل

A Space-Efficient Implementation of the Good-Suffix Heuristic

We present an efficient variation of the good-suffix heuristic, firstly introduced in the well-known Boyer-Moore algorithm for the exact string matching problem. Our proposed variant uses only constant space, retaining much the same time efficiency of the original rule, as shown by extensive experimentation.

متن کامل

Adapting Boyer-Moore-Like Algorithms for Searching Huffman Encoded Texts

In this paper we propose an efficient approach to the compressed string matching problem on Huffman encoded texts, based on the Boyer-Moore strategy. Once a candidate valid shift has been located, a subsequent verification phase checks whether the shift is codeword aligned by taking advantage of the skeleton tree data structure. Our approach leads to algorithms that exhibit a sublinear behavior...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014